Use this space to include your installation screenshots.
Detail the code you used to create, initialize, and push your portfolio repo to GitHub. This will be helpful as you will need to repeat many of these steps to update your portfolio throughout the course.
$ mkdir MICB425_portfolio #make portfolio directory within desired directory
$ cd MICB425_portfolio #go to new directory
$ git init #designate it as a repo
$ touch ID.txt #create blank ID.txt file
$ git add . #stage all files in new repo for commit
$ git commit -m "First commit" #commit files
$ git remote add origin https://github.com/ryankn/MICB425_portfolio #designate remote repo URL
$ git remote -v #verify remote repo URL
$ git push -u origin master #push local repo to remote repo
The following is from the activity of recreating the example PDF, with the header levels changed such that they won’t appear in the table of contents.
The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.
http://phdcomics.com/ Comic posted 1-17-2018
The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting options. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as closely to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)
hint: go to the PhD Comics website to see if you can find the image above
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown
Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (you can most easily tell this from the table of contents).
Another header, now with maths
Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:
1231521+12341556280987
## [1] 1.234156e+13
Or maybe, after you’ve added those numbers, you feel like it’s about time for a table!
I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (more on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.
library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
| speed | dist |
|---|---|
| Min. : 4.0 | Min. : 2.00 |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4 | Mean : 42.98 |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0 | Max. :120.00 |
And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun gif of your choice!
R code from work for Data Science Friday on 26 Jan 18.
#Libraries
#install.packages("tidyverse")
library("tidyverse")
## -- Attaching packages ------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 2.2.1 v purrr 0.2.4
## v tibble 1.4.2 v dplyr 0.7.4
## v tidyr 0.8.0 v stringr 1.2.0
## v readr 1.1.1 v forcats 0.2.0
## -- Conflicts ---------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
#Data Import
metadata <- read.table(file="DS_Friday/26Jan18/Saanich.metadata.txt", header=TRUE, row.names=1, sep="\t", na.strings="NAN")
#Exercise 1
OTU <- read.table(file="DS_Friday/26Jan18/Saanich.OTU.txt", header=TRUE, row.names=1, sep="\t", na.strings="NAN")
#Exercise 2
metadata %>% rownames_to_column('sample') %>%
filter(CH4_nM >= 100 & Temperature_C <= 10) %>%
column_to_rownames('sample') %>%
select(Depth_m,CH4_nM,Temperature_C)
## Depth_m CH4_nM Temperature_C
## SI072_S3_185 185 310.068 9.091
## SI072_S3_200 200 774.034 9.117
newtable <-
metadata %>% rownames_to_column('sample') %>%
select(matches("nM|sample")) %>%
mutate(N2O_uM = N2O_nM/1000, Std_N2O_uM = Std_N2O_nM/1000, CH4_uM = CH4_nM/1000, Std_CH4_uM = Std_CH4_nM/1000) %>%
column_to_rownames('sample')
#Exercise 3
metadata %>% rownames_to_column('sample') %>%
select(matches("nM|sample")) %>%
mutate(N2O_uM = N2O_nM/1000, Std_N2O_uM = Std_N2O_nM/1000, CH4_uM = CH4_nM/1000, Std_CH4_uM = Std_CH4_nM/1000) %>%
column_to_rownames('sample')
## N2O_nM Std_N2O_nM CH4_nM Std_CH4_nM N2O_uM Std_N2O_uM
## SI072_S3_010 0.849 0.114 1030.478 3.070 0.000849 0.000114
## SI072_S3_020 13.199 0.000 29.012 0.000 0.013199 0.000000
## SI072_S3_040 12.829 1.509 37.146 2.695 0.012829 0.001509
## SI072_S3_060 12.306 0.524 36.501 3.521 0.012306 0.000524
## SI072_S3_075 13.896 1.417 24.013 0.435 0.013896 0.001417
## SI072_S3_085 12.959 0.955 7.376 0.029 0.012959 0.000955
## SI072_S3_090 15.551 1.417 4.190 0.159 0.015551 0.001417
## SI072_S3_097 18.682 1.628 3.991 0.759 0.018682 0.001628
## SI072_S3_100 18.087 1.275 3.231 0.392 0.018087 0.001275
## SI072_S3_110 15.843 1.953 3.633 0.127 0.015843 0.001953
## SI072_S3_120 16.304 1.085 3.463 0.519 0.016304 0.001085
## SI072_S3_135 12.909 2.577 4.815 0.658 0.012909 0.002577
## SI072_S3_150 11.815 0.000 8.323 0.000 0.011815 0.000000
## SI072_S3_165 6.310 0.732 23.831 2.291 0.006310 0.000732
## SI072_S3_185 0.000 0.000 310.068 0.000 0.000000 0.000000
## SI072_S3_200 0.000 0.000 774.034 12.745 0.000000 0.000000
## CH4_uM Std_CH4_uM
## SI072_S3_010 1.030478 0.003070
## SI072_S3_020 0.029012 0.000000
## SI072_S3_040 0.037146 0.002695
## SI072_S3_060 0.036501 0.003521
## SI072_S3_075 0.024013 0.000435
## SI072_S3_085 0.007376 0.000029
## SI072_S3_090 0.004190 0.000159
## SI072_S3_097 0.003991 0.000759
## SI072_S3_100 0.003231 0.000392
## SI072_S3_110 0.003633 0.000127
## SI072_S3_120 0.003463 0.000519
## SI072_S3_135 0.004815 0.000658
## SI072_S3_150 0.008323 0.000000
## SI072_S3_165 0.023831 0.002291
## SI072_S3_185 0.310068 0.000000
## SI072_S3_200 0.774034 0.012745
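As a quick cross-check of the nM to uM conversion above, here is a minimal base R sketch. It assumes a small stand-in data frame (`gases`, a name made up for this example) holding the first three samples copied from the printed output, rather than the full `metadata` table.

```r
# Stand-in data: first three samples copied from the printed output above.
# `gases` is an illustrative name, not an object from the original analysis.
gases <- data.frame(
  N2O_nM = c(0.849, 13.199, 12.829),
  CH4_nM = c(1030.478, 29.012, 37.146),
  row.names = c("SI072_S3_010", "SI072_S3_020", "SI072_S3_040")
)

# 1 uM = 1000 nM, so divide by 1000; transform() keeps the row names.
gases_uM <- transform(gases,
                      N2O_uM = N2O_nM / 1000,
                      CH4_uM = CH4_nM / 1000)

print(gases_uM)
```

The converted columns should agree with the `N2O_uM` and `CH4_uM` values printed by the dplyr pipeline for the same samples.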
R code for Data Science Friday assignment due Friday 16 Feb 18.
#Package Installation
#install.packages("tidyverse")
#source("https://bioconductor.org/biocLite.R")
#biocLite("phyloseq")
#Libraries
library("tidyverse")
library("phyloseq")
#Data Import
new_OTUs <-
read.table("DS_Friday/Assignment20180208/Saanich.OTU.new.txt",
header = TRUE, sep = "\t", row.names = 1, na.strings = "NAN")
new_metadata <-
read.table("DS_Friday/Assignment20180208/Saanich.metadata.new.txt",
header = TRUE, sep = "\t", row.names = 1, na.strings = "NAN")
load("DS_Friday/Assignment20180208/phyloseq_object.RData")
#Exercise 1
ggplot(new_metadata, aes(x = CH4_nM, y = Depth_m)) +
geom_point(color = "purple", shape = 17)
#Exercise 2
new_metadata %>%
mutate(Temperature_F = Temperature_C * 9 / 5 + 32) %>%
ggplot(aes(x = Temperature_F, y = Depth_m)) +
geom_point()
#Exercise 3
physeq_percent = transform_sample_counts(physeq, function(x) 100 * x/sum(x))
plot_bar(physeq_percent, fill="Domain") +
geom_bar(aes(fill=Domain), stat="identity") +
labs(x = "Sample depth", y = "Relative abundance (%)", title = "Domains from 10 to 200 m in Saanich Inlet")
#Exercise 4
new_metadata %>%
select(matches("uM|depth"),-matches("Std"),-H2S_uM) %>%
gather(key = "Nutrient", value = "Concentration", -Depth_m) %>%
ggplot(., aes(x = Depth_m, y = Concentration)) +
geom_point() +
geom_line() +
facet_wrap( ~ Nutrient, scales = "free") +
theme(legend.position = "none") +
labs(x = "Depth (m)", y = expression(paste("Concentration (", mu, "M)")))
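The Celsius to Fahrenheit conversion used inside `mutate()` in Exercise 2 can be checked on its own. This is a small sketch with a hypothetical helper, `c_to_f`, which is not part of the assignment code; the 9.091 value is a temperature taken from the earlier Saanich output.

```r
# Hypothetical helper (not in the original code): Celsius to Fahrenheit.
c_to_f <- function(temp_c) {
  temp_c * 9 / 5 + 32
}

# Sanity checks at the freezing and boiling points of water.
c_to_f(0)    # 32
c_to_f(100)  # 212

# A temperature from the Saanich metadata output (SI072_S3_185).
c_to_f(9.091)
```

Because the function is vectorized, it could also be applied to a whole column, for example `c_to_f(new_metadata$Temperature_C)`, assuming `new_metadata` is loaded as above.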
The first thing for any assignment should be link(s) to any relevant literature (which should be included as full citations in a module references section below).
Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.
turnover time? N/P/C
What were the primary methodological approaches used?
To make calculating such figures plausible, the authors examined the number of prokaryotes in the three large habitats where current knowledge suggests most prokaryotes reside: aquatic environments, soil, and the subsurface. All numbers were taken from previously published papers reporting figures such as CFU/mL counts, volume estimates, or C content.
Summarize the main results or findings.
The amount of prokaryotic N, P, and C is roughly 60-100% of the amount in plants.
Analysis of the subsurface prokaryotic community suggests the turnover time is extremely long, on the order of [].
Do new questions arise from the results?
How is the turnover time for the subsurface community so long? At that kind of turnover rate, can that still be considered life?
Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
The terminology, particularly what the units actually mean, was hard to follow.
Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.
| Habitat | Abundance (cells) |
|---|---|
| Aquatic | 1.161 x 10^29 |
| Soil | 2.556 x 10^29 |
| Subsurface | 3.8 x 10^30 |
4 x 10^4 cells/mL divided by 5 x 10^5 cells/mL = 0.08 = 8%
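The habitat totals and the 8% figure can be reproduced with a few lines of R. This is only a sketch of the arithmetic in these notes; the variable names are made up for the example, and the habitat shares are an extra step not worked out above.

```r
# Abundances (cells) as listed in the habitat table above.
abundance <- c(Aquatic = 1.161e29, Soil = 2.556e29, Subsurface = 3.8e30)

# Total prokaryotic abundance across the three habitats (about 4.2e30 cells).
total_cells <- sum(abundance)

# Share of the total found in each habitat, in percent.
share_percent <- round(100 * abundance / total_cells, 1)

# The cell-density ratio from the notes: 4e4 cells/mL out of 5e5 cells/mL.
ratio_percent <- 100 * 4e4 / 5e5

print(total_cells)
print(share_percent)
print(ratio_percent)
```

On these numbers the subsurface holds roughly 91% of the cells, which is why it dominates the total.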
What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?
Autotrophs fix inorganic carbon (e.g. CO2) into biomass, heterotrophs assimilate organic carbon, and lithotrophs consume inorganic substrates.
Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?
Deep subsurface habitats, both terrestrial and marine: up to 4 km. The limiting factor is temperature, with an upper limit of about 125 degrees C; temperature changes by about 22 degrees C per km.
Mariana Trench: how deep is it? 10.9 km
Mount Everest: 8.8 km. Is anything really alive up in the atmosphere at 77 km? That doesn’t seem likely: lack of nutrients or moisture, and there’s lots of UV radiation too, sketchy. Let’s say 20 km.
Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?
Thus the vertical distance is about 24 km from top to bottom (tip of Mount Everest to 4-5 km under the Mariana Trench).
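The 24 km figure is just the sum of the three distances in these notes. A minimal sketch, with variable names made up for the example:

```r
# Distances in km, taken from the notes above.
everest_km    <- 8.8        # height of Mount Everest above sea level
trench_km     <- 10.9       # depth of the Mariana Trench below sea level
subsurface_km <- c(4, 5)    # habitable depth below the trench floor (4-5 km)

# Vertical extent of the biosphere for the lower and upper subsurface estimate.
extent_km <- everest_km + trench_km + subsurface_km
print(extent_km)  # 23.7 and 24.7, i.e. about 24 km
```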
How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)
Annual cellular production of prokaryotes was calculated based on literature values for population size and population turnover time in days. In the following example calculation, population size is P, turnover time is T, and annual cellular production is A.
\[A = P \times \frac{365}{T}\]
3.6 x 10^28 cells x 365 days per year / 16 days per turnover = 8.2 x 10^29 cells/year
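The same calculation as a small R function, as a sketch. The function name `annual_production` and its argument names are made up for this example; P and T are as defined above.

```r
# Annual cellular production A = P * 365 / T
#   P      : population size (cells)
#   T_days : turnover time (days per turnover)
annual_production <- function(P, T_days) {
  P * 365 / T_days
}

# Example values from the notes: 3.6e28 cells turning over every 16 days.
A <- annual_production(P = 3.6e28, T_days = 16)
print(signif(A, 2))  # about 8.2e29 cells/year
```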
0.72 x 4 = 2.88 Pg C per turnover
51 Pg C per year of productivity x 85% = 43 Pg C per year goes to the upper 200 m
43 / 2.88 = 14.9 turnovers per year
365 / 14.9 = 24.5 days per turnover
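The turnover-time arithmetic can be laid out step by step in R. This sketch follows the rounding used in the notes (43 Pg C and 14.9 turnovers); without the intermediate rounding the answer comes out slightly lower, at a little over 24 days. Variable names are made up for the example.

```r
# Carbon needed for one turnover of the population (Pg C), as in the notes.
carbon_per_turnover <- 0.72 * 4                 # 2.88

# Net primary productivity reaching the upper 200 m (Pg C per year),
# rounded to the nearest Pg as in the notes.
npp_upper_200m <- round(51 * 0.85)              # 43

# Turnovers that this productivity can support per year, rounded as in the notes.
turnovers_per_year <- round(npp_upper_200m / carbon_per_turnover, 1)  # 14.9

# Days per turnover.
days_per_turnover <- 365 / turnovers_per_year
print(round(days_per_turnover, 1))  # 24.5
```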
Why does this vary with depth? Different habitats produce and consume C differently.
Carbon assimilation efficiency and carbon content determine turnover rates in the upper 200 m of the ocean. The amount of net primary productivity required to sustain prokaryotic turnover depends on both the C assimilation efficiency and the total carbon content of the population, which together set an upper limit on turnover rates. These vary between habitats because of different assimilation efficiencies and total carbon content, as well as the amount of total net primary productivity each habitat zone consumes.
Also viruses: they kill bugs, causing turnover, and carry accessory metabolic genes that, when they infect cells, supplement the metabolic capacities of the community.
4 x 10^-7 mutations/generation
(4 x 10^-7)^4 = 2.56 x 10^-26 mutations/generation
365 / 16 = 22.8 turnovers per year
3.6 x 10^28 cells x 22.8 = 8.2 x 10^29 cells/year
8.2 x 10^29 cells/year x 2.56 x 10^-26 mutations/generation = 2.1 x 10^4 mutations/year
convert to hours - divide by 365 x 24
2.1 x 10^4 / 365 / 24 = 2.4 mutations/hour
1 / 2.4 = 0.4 hours/mutation
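The chain of steps above can be checked in R. This is a sketch of the arithmetic only, taking the population size of 3.6 x 10^28 cells and the 16-day turnover time from the annual production example; the variable names are made up for the example.

```r
# Rate for a single mutation, and for four simultaneous mutations.
rate_single <- 4e-7                  # mutations per generation
rate_four   <- rate_single^4         # 2.56e-26

# Cells produced per year: population size times turnovers per year.
turnovers_per_year <- 365 / 16       # about 22.8
cells_per_year     <- 3.6e28 * turnovers_per_year   # about 8.2e29

# Expected four-fold mutation events per year, then per hour.
events_per_year <- cells_per_year * rate_four       # about 2.1e4
events_per_hour <- events_per_year / (365 * 24)     # about 2.4

# Average waiting time between events, in hours.
hours_per_event <- 1 / events_per_hour              # about 0.4
print(c(events_per_year, events_per_hour, hours_per_event))
```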
Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?
What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?
Comment on the emergence of microbial life and the evolution of Earth systems.
Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.
1.3 billion years ago
200,000 years ago First humans
Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:
Precambrian
Phanerozoic
Increased oxygenation of the atmosphere, mass extinction events from meteorite impacts
Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.
Biogeochemical processes - H, C, N, O, S, and P fluxes
Abiotic chemical processes tend to be based on acid/base reactions, while biotic ones are based on redox reactions. The reactions are nested: abiotic processes provide the e- acceptors that the biotic reactions use, as well as C, S, and P via tectonics, volcanism, and weathering(?)
Why is Earth’s redox state considered an emergent property?
It is an emergent property of microbial life on Earth.
How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?
Synergistic multi-species assemblage of the overall pathway
Using information provided in the text, describe how the nitrogen cycle partitions between different redox “niches” and microbial groups. Is there a relationship between the nitrogen cycle and climate change?
NH4 -> NO2 is a niche, NO2 -> NO3 is a niche; together these are nitrification, which typically involves CO2 fixation to organic matter
NH4 + NO2 -> N2 (anammox)
N2 -> NH4 reduction (N fixation)
NO2 or NO3 -> NO -> N2O -> N2 (denitrification), and N2O can be released (greenhouse gas)
NO3 -> NO2 -> NH4 (dissimilatory nitrate reduction to ammonium, DNRA)
What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?
On what basis do the authors consider microbes the guardians of metabolism?
Utilize this space to include a bibliography of any literature you want associated with this module. We recommend keeping this as the final header under each module.
Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc Natl Acad Sci U S A 95:6578-6583. PMC33863